Four K-means models for a toy data set of three clusters. A, B and C stand for three labels. The circles, the triangles and the diamonds stand for the resulting clusters defined by the K-means models.

Four K-means cluster models constructed for the amino acid data using two, three, four and five clusters. Different clusters are represented by different enclosures around the amino acids.

The next issue associated with the K-means algorithm is how to choose a proper cluster number for a data set, because an appropriate cluster number is critically important to the performance of a constructed cluster model. In the kmeans function, the parameter centers is used to specify the cluster number. For a
d data set, no a priori knowledge about the exact cluster number